
618 research outputs found

Neurostream: Scalable and Energy Efficient Deep Learning with Smart Memory Cubes

Author: Azarkhish Erfan
Benini Luca
Loi Igor
Rossi Davide
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 24/09/2017

High-performance computing systems are moving towards 2.5D and 3D memory hierarchies, based on High Bandwidth Memory (HBM) and Hybrid Memory Cube (HMC) to mitigate the main memory bottlenecks. This trend is also creating new opportunities to revisit near-memory computation. In this paper, we propose a flexible processor-in-memory (PIM) solution for scalable and energy-efficient execution of deep convolutional networks (ConvNets), one of the fastest-growing workloads for servers and high-end embedded systems. Our co-design approach consists of a network of Smart Memory Cubes (modular extensions to the standard HMC) each augmented with a many-core PIM platform called NeuroCluster. NeuroClusters have a modular design based on NeuroStream coprocessors (for Convolution-intensive computations) and general-purpose RISC-V cores. In addition, a DRAM-friendly tiling mechanism and a scalable computation paradigm are presented to efficiently harness this computational capability with a very low programming effort. NeuroCluster occupies only 8 percent of the total logic-base (LoB) die area in a standard HMC and achieves an average performance of 240 GFLOPS for complete execution of full-featured state-of-the-art (SoA) ConvNets within a power budget of 2.5 W. Overall 11 W is consumed in a single SMC device, with 22.5 GFLOPS/W energy-efficiency which is 3.5X better than the best GPU implementations in similar technologies. The minor increase in system-level power and the negligible area increase make our PIM system a cost-effective and energy efficient solution, easily scalable to 955 GFLOPS with a small network of just four SMCs.

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

A Hybrid Instruction Prefetching Mechanism for Ultra Low-Power Multicore Clusters

Author: Azarkhish Erfan
Benini Luca
Loi Igor
Payami Maryam*
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017

The instruction memory hierarchy plays a critical role in performance and energy efficiency of ultralow-power (ULP) processors for the Internet-of-Things (IoT) end-nodes. This is mainly due to the extremely tight power envelope and area budgets, which imply small instruction-caches (I-Cache) operating at very low supply voltages (near-threshold). The challenge is aggravated by the fact that multiple processors, fetching in parallel, require plenty of bandwidth from the I-Caches. In this letter, we propose a low-cost and energy efficient hybrid instruction-prefetching mechanism to be integrated with a ULP multicore cluster. We study its performance for a wide range of IoT applications, from cryptography to computer vision, and show that it can effectively improve the hit-rate of almost all of them to above 95% (average performance improvement of over 2×). In addition, we designed our prefetcher and integrated it in a 4-core cluster in 28 nm fully-depleted silicon-on-insulator (FDSOI) technology. We show that the system's power consumption increases only by about 11% and silicon area by less than 1%. Altogether, a total energy reduction of 1.9x is achieved, thanks to more than 2x performance improvement, enabling a significantly longer battery life.
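The energy figures in this abstract can be cross-checked with a short calculation. The sketch below is not from the paper; it only assumes that energy equals average power times execution time, and the function name `implied_speedup` is made up for illustration. Under that assumption, a 1.9x energy reduction at about 11% higher power implies a speedup slightly above 2x, which agrees with the "more than 2x" claim.

```python
def implied_speedup(energy_reduction: float, power_increase: float) -> float:
    """Speedup implied by an energy reduction and a relative power increase.

    Assumes E = P * t, so E_old / E_new = (P_old / P_new) * (t_old / t_new),
    which rearranges to t_old / t_new = (E_old / E_new) * (P_new / P_old).
    """
    power_ratio = 1.0 + power_increase  # P_new / P_old
    return energy_reduction * power_ratio


# Figures quoted in the abstract: 1.9x less energy, about 11% more power.
speedup = implied_speedup(energy_reduction=1.9, power_increase=0.11)
print(f"implied speedup: {speedup:.2f}x")  # implied speedup: 2.11x
```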

Crossref

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Scalable Hierarchical Instruction Cache for Ultra-Low-Power Processors Clusters

Author: Benini Luca
Chen Jie
Flamand Eric
Loi Igor
Rossi Davide
Tagliavini Giuseppe
Publication date: 03/09/2023

High Performance and Energy Efficiency are critical requirements for Internet of Things (IoT) end-nodes. Exploiting tightly-coupled clusters of programmable processors (CMPs) has recently emerged as a suitable solution to address this challenge. One of the main bottlenecks limiting the performance and energy efficiency of these systems is the instruction cache architecture due to its criticality in terms of timing (i.e., maximum operating frequency), bandwidth, and power. We propose a hierarchical instruction cache tailored to ultra-low-power tightly-coupled processor clusters where a relatively large cache (L1.5) is shared by L1 private caches through a two-cycle latency interconnect. To address the performance loss caused by the L1 capacity misses, we introduce a next-line prefetcher with cache probe filtering (CPF) from L1 to L1.5. We optimize the core instruction fetch (IF) stage by removing the critical core-to-L1 combinational path. We present a detailed comparison of instruction cache architectures' performance and energy efficiency for parallel ultra-low-power (ULP) clusters. Focusing on the implementation, our two-level instruction cache provides better scalability than existing shared caches, delivering up to 20% higher operating frequency. On average, the proposed two-level cache improves maximum performance by up to 17% compared to the state-of-the-art while delivering similar energy efficiency for most relevant applications. Comment: 14 pages

arXiv.org e-Print Archive

An IoT Endpoint System-on-Chip for Secure and Energy-Efficient Near-Sensor Analytics

Author: Benini Luca
Conti Francesco
Gautschi Michael
Gürkaynak Frank Kagan
Haugou Germain
Loi Igor
Mangard Stefan
Muehlberghuber Michael
Pullini Antonio
Rossi Davide
Schiavone Pasquale Davide
Schilling Robert
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 23/04/2017

Near-sensor data analytics is a promising direction for IoT endpoints, as it minimizes energy spent on communication and reduces network load - but it also poses security concerns, as valuable data is stored or sent over the network at various stages of the analytics pipeline. Using encryption to protect sensitive data at the boundary of the on-chip analytics engine is a way to address data security issues. To cope with the combined workload of analytics and encryption in a tight power envelope, we propose Fulmine, a System-on-Chip based on a tightly-coupled multi-core cluster augmented with specialized blocks for compute-intensive data processing and encryption functions, supporting software programmability for regular computing tasks. The Fulmine SoC, fabricated in 65nm technology, consumes less than 20mW on average at 0.8V achieving an efficiency of up to 70pJ/B in encryption, 50pJ/px in convolution, or up to 25MIPS/mW in software. As a strong argument for real-life flexible application of our platform, we show experimental results for three secure analytics use cases: secure autonomous aerial surveillance with a state-of-the-art deep CNN consuming 3.16pJ per equivalent RISC op; local CNN-based face detection with secured remote recognition in 5.74pJ/op; and seizure detection with encrypted data collection from EEG within 12.7pJ/op. Comment: 15 pages, 12 figures, accepted for publication in the IEEE Transactions on Circuits and Systems I: Regular Papers

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

A -1.8V to 0.9V body bias, 60 GOPS/W 4-core cluster in low-power 28nm UTBB FD-SOI technology

Author: Benini Luca
Flatresse Philippe
Gautschi Michael
Gurkaynak Frank Kagan
Loi Igor
Pullini Antonio
Rossi Davide
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015

A 4-core cluster fabricated in low power 28nm UTBB FD-SOI conventional well technology is presented. The SoC architecture enables the processors to operate 'on-demand' on a 0.44V (1.8MHz) to 1.2V (475MHz) supply voltage wide range and -1.2V to 0.9V body bias wide range achieving the peak energy efficiency of 60 GOPS/W, (419μW, 6.4MHz) at 0.5V with 0.5V forward body bias. The proposed SoC energy efficiency is 1.4x to 3.7x greater than other low-power processors with comparable performance.
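As a rough sanity check on the peak-efficiency figures, the sketch below converts 60 GOPS/W at 419 μW and 6.4 MHz into operations per cycle. It assumes, as the parenthetical in the abstract suggests but does not state outright, that all three numbers describe the same operating point; the variable names are illustrative only. The result is close to one operation per cycle per core for the 4-core cluster.

```python
# Figures quoted in the abstract (assumed to share one operating point).
efficiency_ops_per_joule = 60e9  # 60 GOPS/W is 60e9 operations per joule
power_watts = 419e-6             # 419 microwatts
frequency_hz = 6.4e6             # 6.4 MHz
cores = 4

# Throughput follows from efficiency times power: (ops/J) * (J/s) = ops/s.
throughput_ops_per_s = efficiency_ops_per_joule * power_watts

# Dividing by the clock rate gives operations per cycle for the cluster.
ops_per_cycle = throughput_ops_per_s / frequency_hz
ops_per_cycle_per_core = ops_per_cycle / cores

print(f"throughput: {throughput_ops_per_s / 1e6:.2f} MOPS")
print(f"ops per cycle (cluster): {ops_per_cycle:.2f}")
print(f"ops per cycle per core: {ops_per_cycle_per_core:.2f}")
```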

Crossref

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Characterization and Implementation of Fault-Tolerant Vertical Links for 3-D Networks-on-Chip

Author: Angiolini Federico
Benini Luca
Fujita Shinobu
Loi Igor
Mitra Subhasish
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 16/12/2011

Through silicon vias (TSVs) provide an efficient way to support vertical communication among different layers of a vertically stacked chip, enabling scalable 3-D networks-on-chip (NoC) architectures. Unfortunately, low TSV yields significantly impact the feasibility of high-bandwidth vertical connectivity. In this paper, we present a semi-automated design flow for 3-D NoCs including a defect-tolerance scheme to increase the global yield of 3-D stacked chips. Starting from an accurate physical and geometrical model of TSVs: 1) we extract a circuit-level model for vertical interconnections; 2) we use it to evaluate the design implications of extending switch architectures with ports in the vertical direction; moreover, 3) we present a defect-tolerance technique for TSV-based multi-bit links through an effective use of redundancy; and finally, 4) we present a design flow allowing for post-layout simulation of NoCs with links in all three physical dimensions. Experimental results show that a 3-D NoC implementation yields around 10% frequency improvement over a 2-D one, thanks to the propagation delay advantage of TSVs and the shorter links. In addition, the adopted fault tolerance scheme demonstrates a significant yield improvement, ranging from 66% to 98%, with a low area cost (20.9% on a vertical link in a NoC switch, which leads to a modest 2.1% increase in the total switch area) in 130 nm technology, with minimal impact on very large-scale integrated design and test flows.
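To illustrate why redundancy can raise the yield of a multi-bit TSV link, here is a minimal sketch of a generic spare-TSV model. It is not the scheme from the paper: it assumes independent defects, a uniform per-TSV yield, and spares that can replace any failed TSV in the link. The function name `link_yield` and the numbers used (99% per-TSV yield, 32 signal TSVs) are hypothetical.

```python
from math import comb


def link_yield(tsv_yield: float, signals: int, spares: int = 0) -> float:
    """Probability that a link of `signals` TSVs works, given `spares` extras.

    Assumes independent defects and fully flexible spares: the link works
    whenever at most `spares` of the (signals + spares) TSVs are defective.
    """
    total = signals + spares
    return sum(
        comb(total, failed)
        * (1.0 - tsv_yield) ** failed
        * tsv_yield ** (total - failed)
        for failed in range(spares + 1)
    )


# Hypothetical example: 32 signal TSVs, each with a 99% chance of working.
for spares in (0, 1, 2):
    print(f"spares={spares}: link yield = {link_yield(0.99, 32, spares):.4f}")
```

With no spares the link yield reduces to the per-TSV yield raised to the number of signals, so even a small number of spares gives a visible improvement in this simplified model.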

Infoscience - École polytechnique fédérale de Lausanne

High Performance Ambipolar Field-Effect Transistor of Random Network Carbon Nanotubes

Author: Bisri Satria Zulkarnaen
Derenskyi Vladimir
Gao Jia
Gomulya Widianta
Gordiichuk Pavlo
Herrmann Andreas
Iezhokin Igor
Loi Maria Antonietta
Publication venue: 'Wiley'
Publication date: 04/12/2012

Ambipolar field-effect transistors of random network carbon nanotubes are fabricated from an enriched dispersion utilizing a conjugated polymer as the selective purifying medium. The devices exhibit high mobility values for both holes and electrons (3 cm²/V·s) with a high on/off ratio (10⁶). The performance demonstrates the effectiveness of this process to purify semiconducting nanotubes and to remove the residual polymer.

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Overall Survival With Palbociclib And Fulvestrant in Women With HR+/HER2– ABC: Updated Exploratory Analyses of PALOMA-3, a Double-Blind, Phase 3 Randomized Study

Author: André Fabrice
Bananis Eustratios
Bondarenko Igor
Colleoni Marco
Cristofanilli Massimo
DeMichele Angela
Frean Maria Jose Lechuga
Harbeck Nadia
Huang Xin
Im Seock-Ah
Iwata Hiroji
Liu Yuan
Loi Sherene
Loibl Sibylle
Masuda Norikazu
O’Leary Ben
Rugo Hope S.
Kim Sindy
Slamon Dennis J.
Turner Nicholas C.
Publication venue: 'American Association for Cancer Research (AACR)'
Publication date: 15/08/2022

Purpose: To conduct an updated exploratory analysis of overall survival (OS) with a longer median follow-up of 73.3 months and evaluate the prognostic value of molecular analysis by circulating tumor DNA (ctDNA). Patients and methods: Patients with hormone receptor−positive/human epidermal growth factor receptor 2−negative (HR+/HER2−) advanced breast cancer (ABC) were randomized 2:1 to receive palbociclib (125 mg orally/d; 3/1 week schedule) and fulvestrant (500 mg intramuscularly) or placebo and fulvestrant. This OS analysis was performed when 75% of enrolled patients died (393 events in 521 randomized patients). ctDNA analysis was performed among patients who provided consent. Results: At the data cutoff (August 17, 2020), 258 and 135 deaths occurred in the palbociclib and placebo groups, respectively. The median OS (95% CI) was 34.8 months (28.8−39.9) in the palbociclib group and 28.0 months (23.5−33.8) in the placebo group (stratified hazard ratio 0.81; 95% CI, 0.65−0.99). The 6-year OS rate (95% CI) was 19.1% (14.9−23.7) and 12.9% (8.0−19.1) in the palbociclib and placebo groups, respectively. Favorable OS with palbociclib plus fulvestrant compared with placebo plus fulvestrant was observed in most subgroups, particularly in patients with endocrine-sensitive disease, no prior chemotherapy for ABC, low circulating tumor fraction, and regardless of ESR1, PIK3CA, or TP53 mutation status. No new safety signals were identified. Conclusions: The clinically meaningful improvement in OS associated with palbociclib plus fulvestrant was maintained with >6 years of follow-up in patients with HR+/HER2− ABC, supporting palbociclib plus fulvestrant as a standard of care in these patients. Trial Registration: ClinicalTrials.gov Identifier: NCT0194213

Repository for DZ "DMA"

Overall Survival with Palbociclib and Fulvestrant in Women with HR+/HER2− ABC: Updated Exploratory Analyses of PALOMA-3, a Double-blind, Phase III Randomized Study

Author: André Fabrice
Bananis Eustratios
Bondarenko Igor
Colleoni Marco
Cristofanilli Massimo
DeMichele Angela
Frean Maria Jose Lechuga
Harbeck Nadia
Huang Xin
Im Seock-Ah
Iwata Hiroji
Kim Sindy
Loi Sherene
Loibl Sibylle
Masuda Norikazu
O’Leary Ben
Rugo Hope S.
Slamon Dennis J.
Turner Nicholas C.
Liu Yuan
Publication venue: 'American Association for Cancer Research (AACR)'
Publication date: 01/01/2022

Purpose: To conduct an updated exploratory analysis of overall survival (OS) with a longer median follow-up of 73.3 months and evaluate the prognostic value of molecular analysis by circulating tumor DNA (ctDNA). Patients and Methods: Patients with hormone receptor–positive/human epidermal growth factor receptor 2–negative (HR+/HER2−) advanced breast cancer (ABC) were randomized 2:1 to receive palbociclib (125 mg orally/day; 3/1 week schedule) and fulvestrant (500 mg intramuscularly) or placebo and fulvestrant. This OS analysis was performed when 75% of enrolled patients died (393 events in 521 randomized patients). ctDNA analysis was performed among patients who provided consent. Results: At the data cutoff (August 17, 2020), 258 and 135 deaths occurred in the palbociclib and placebo groups, respectively. The median OS [95% confidence interval (CI)] was 34.8 months (28.8–39.9) in the palbociclib group and 28.0 months (23.5–33.8) in the placebo group (stratified hazard ratio, 0.81; 95% CI, 0.65–0.99). The 6-year OS rate (95% CI) was 19.1% (14.9–23.7) and 12.9% (8.0–19.1) in the palbociclib and placebo groups, respectively. Favorable OS with palbociclib plus fulvestrant compared with placebo plus fulvestrant was observed in most subgroups, particularly in patients with endocrine-sensitive disease, no prior chemotherapy for ABC and low circulating tumor fraction and regardless of ESR1, PIK3CA, or TP53 mutation status. No new safety signals were identified. Conclusions: The clinically meaningful improvement in OS associated with palbociclib plus fulvestrant was maintained with >6 years of follow-up in patients with HR+/HER2− ABC, supporting palbociclib plus fulvestrant as a standard of care in these patients.

Repository for DZ "DMA"

Design Issues and Considerations for Low-Cost 3-D TSV IC Technology

Author: Abdelkarim Mercha
Alain Phommahaxay
Ann Opdebeeck
Antonio Pullini
Bart De Wachter
Bart Vandevelde
Cristina Torregiani
Dan Perry
Dimitri Linten
Dimitrios Velenis
Eric Beyne
Federico Angiolini
Geert Van der Plas
Guruprasad Katti
Herman Oprins
Igor Loi
Ingrid De Wolf
Jan Van Olmen
Luca Benini
Marc Nelis
Michal Rakowski
Michele Stucchi
Miro Cupac
Morin Dehan
Muriel de Potter de ten Broeck
Nikolaos Minas
Paresh Limaye
Paul Marchal
Rahul Agarwal
Riet Labie
Stephane Bronckers
Steven Thijs
Veerle Simons
Vladimir Cherman
Wim Dehaene
Wouter Ruythooren
Youssef Travaly
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref